Input/output example
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Research Report (0.94)
- Workflow (0.93)
Don't Transform the Code, Code the Transforms: Towards Precise Code Rewriting using LLMs
Cummins, Chris, Seeker, Volker, Armengol-Estapé, Jordi, Markosyan, Aram H., Synnaeve, Gabriel, Leather, Hugh
Tools for rewriting, refactoring and optimizing code should be fast and correct. Large language models (LLMs), by their nature, possess neither of these qualities. Yet, there remains tremendous opportunity in using LLMs to improve code. We explore the use of LLMs not to transform code, but to code transforms. We propose a chain-of-thought approach to synthesizing code transformations from a small number of input/output code examples that incorporates execution and feedback. Unlike the direct rewrite approach, LLM-generated transformations are easy to inspect, debug, and validate. The logic of the rewrite is explicitly coded and easy to adapt. The compute required to run code transformations is minute compared to that of LLM rewriting. We test our approach on 16 Python code transformations and find that LLM-generated transforms are perfectly precise for 7 of them and less imprecise than direct LLM rewriting on the others. We hope to encourage further research into improving the precision of LLM code rewriting.
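The core idea in the abstract above — validating an explicitly coded transform by executing it on the input/output examples — can be sketched in a few lines of Python. The candidate transform below is a hypothetical stand-in for an LLM-generated one, not the paper's actual output:

```python
# Sketch of "code the transforms": instead of asking an LLM to rewrite each
# file directly, we validate a candidate transform function (here a
# hypothetical, hand-written stand-in for an LLM-generated one) by running
# it on the input/output examples.

def candidate_transform(source: str) -> str:
    # Hypothetical synthesized transform: replace '== None' comparisons
    # with 'is None' (a common Python lint fix).
    return source.replace("== None", "is None")

def validate(transform, examples):
    """Run the transform on every input and require exact-match output."""
    return all(transform(inp) == out for inp, out in examples)

examples = [
    ("if x == None:\n    pass", "if x is None:\n    pass"),
    ("y = a == None", "y = a is None"),
]

print(validate(candidate_transform, examples))  # True
```

Because the transform is ordinary code, a failing example points directly at the line of logic to fix, which is the inspectability advantage the abstract claims.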
TF-Coder: Program Synthesis for Tensor Manipulations
Shi, Kensen, Bieber, David, Singh, Rishabh
The success and popularity of deep learning is on the rise, partially due to powerful deep learning frameworks such as TensorFlow and PyTorch that make it easier to develop deep learning models. However, these libraries also come with steep learning curves, since programming in these frameworks is quite different from traditional imperative programming with explicit loops and conditionals. In this work, we present a tool called TF-Coder for programming by example in TensorFlow. TF-Coder uses a bottom-up weighted enumerative search, with value-based pruning of equivalent expressions and flexible type- and value-based filtering to ensure that expressions adhere to various requirements imposed by the TensorFlow library. We also train models that predict TensorFlow operations from features of the input and output tensors and natural language descriptions of tasks, and use the models to prioritize relevant operations during the search. TF-Coder solves 63 of 70 real-world tasks within 5 minutes, often finding solutions that are simpler than those written by TensorFlow experts.
- North America > United States > California > San Francisco County > San Francisco (0.28)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- (16 more...)
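TF-Coder's bottom-up weighted enumerative search with value-based pruning of equivalent expressions can be illustrated on a much simpler domain. The toy below searches over plain integers with an illustrative three-operation set rather than TensorFlow operations, and uses uniform weights:

```python
import itertools

# Toy bottom-up enumerative search in the spirit of TF-Coder, but over
# integers instead of tensors. Operation names and the absence of weights
# are illustrative simplifications.
OPS = [
    ("add", 2, lambda a, b: a + b),
    ("mul", 2, lambda a, b: a * b),
    ("neg", 1, lambda a: -a),
]

def bottom_up_search(inputs, target, max_rounds=4):
    # 'seen' maps each computed value to one expression producing it; keeping
    # only the first expression per value prunes all semantically equivalent
    # expressions, which is the value-based pruning idea.
    seen = {v: str(v) for v in inputs}
    for _ in range(max_rounds):
        new = {}
        for name, arity, fn in OPS:
            for args in itertools.product(list(seen.items()), repeat=arity):
                val = fn(*[v for v, _ in args])
                if val not in seen and val not in new:
                    new[val] = f"{name}({', '.join(e for _, e in args)})"
        if target in new:
            return new[target]
        seen.update(new)
    return seen.get(target)

print(bottom_up_search([2, 3], 10))
```

Each round combines all previously seen values with every operation, so expressions are enumerated smallest-first; TF-Coder additionally orders this enumeration by learned operation weights and filters arguments by type and value.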
Improving Neural Program Synthesis with Inferred Execution Traces
Shin, Richard, Polosukhin, Illia, Song, Dawn
The task of program synthesis, or automatically generating programs that are consistent with a provided specification, remains a challenging task in artificial intelligence. As in other fields of AI, deep learning-based end-to-end approaches have made great advances in program synthesis. However, more so than other fields such as computer vision, program synthesis provides greater opportunities to explicitly exploit structured information such as execution traces, which contain a superset of the information in input/output pairs. While execution traces are highly useful for program synthesis, they are more difficult to obtain than input/output pairs. We therefore use the insight that we can split the process into two parts: infer the trace from the input/output example, then infer the program from the trace. This simple modification leads to state-of-the-art results in program synthesis in the Karel domain, improving accuracy to 81.3% from the 77.12% of prior work.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Research Report (0.94)
- Workflow (0.93)
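The two-stage split described above — infer the trace from the input/output example, then infer the program from the trace — can be sketched in a toy domain of single-integer states and two operations (the Karel domain itself is far richer, and the paper uses neural models rather than the brute-force search shown here):

```python
# Toy two-stage synthesis: stage 1 infers an execution trace (the sequence
# of intermediate states) from an input/output pair; stage 2 recovers the
# program from the trace. The domain and op set are illustrative only.

OPS = {"inc": lambda x: x + 1, "double": lambda x: x * 2}

def infer_trace(inp, out, max_len=5):
    """Stage 1: breadth-first search for a state sequence from inp to out."""
    frontier = [[inp]]
    for _ in range(max_len):
        nxt = []
        for trace in frontier:
            for fn in OPS.values():
                t = trace + [fn(trace[-1])]
                if t[-1] == out:
                    return t
                nxt.append(t)
        frontier = nxt
    return None

def infer_program(trace):
    """Stage 2: name the op that explains each consecutive state pair."""
    return [next(n for n, f in OPS.items() if f(a) == b)
            for a, b in zip(trace, trace[1:])]

trace = infer_trace(3, 8)     # [3, 4, 8]
print(infer_program(trace))   # ['inc', 'double']
```

Notice that stage 2 is nearly trivial once the trace is known — each step pins down exactly one operation — which is the intuition behind splitting the problem.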
Stepping Stones to Inductive Synthesis of Low-Level Looping Programs
Inductive program synthesis, from input/output examples, can provide an opportunity to automatically create programs from scratch without presupposing the algorithmic form of the solution. For induction of general programs with loops (as opposed to loop-free programs, or synthesis for domain-specific languages), the state of the art is at the level of introductory programming assignments. Most problems that require algorithmic subtlety, such as fast sorting, have remained out of reach without the benefit of significant problem-specific background knowledge. A key challenge is to identify cues that are available to guide search towards correct looping programs. We present MAKESPEARE, a simple delayed-acceptance hillclimbing method that synthesizes low-level looping programs from input/output examples. During search, delayed acceptance bypasses small gains to identify significantly-improved stepping stone programs that tend to generalize and enable further progress. The method performs well on a set of established benchmarks, and succeeds on the previously unsolved "Collatz Numbers" program synthesis problem. Additional benchmarks include the problem of rapidly sorting integer arrays, in which we observe the emergence of comb sort (a Shell sort variant that is empirically fast). MAKESPEARE has also synthesized a record-setting program on one of the puzzles from the TIS-100 assembly language programming game.
- North America > United States > Massachusetts > Middlesex County > Reading (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Information Technology > Software > Programming Languages (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
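One reading of delayed acceptance, sketched on a toy bit-string problem: a mutant that does not immediately beat the incumbent is given a budget of further mutations to become a stepping stone before it is discarded. MAKESPEARE's actual acceptance schedule and its low-level program representation differ; everything below is illustrative:

```python
import random

# Hedged sketch of delayed-acceptance hill climbing on a toy problem
# (maximize the number of 1-bits). A non-improving mutant is not rejected
# on the first comparison; it gets 'patience' extra mutations to turn into
# a better "stepping stone" before being discarded.

def score(bits):
    return sum(bits)

def mutate(bits, rng):
    b = bits[:]
    b[rng.randrange(len(b))] ^= 1  # flip one random bit
    return b

def delayed_acceptance_climb(n=16, iters=400, patience=10, seed=0):
    rng = random.Random(seed)
    best = [0] * n
    for _ in range(iters):
        cand = mutate(best, rng)
        budget = patience
        # Delayed acceptance: keep walking from a worse candidate for a
        # while instead of rejecting it immediately.
        while score(cand) <= score(best) and budget > 0:
            cand = mutate(cand, rng)
            budget -= 1
        if score(cand) > score(best):
            best = cand
    return best

print(score(delayed_acceptance_climb()))
```

The budgeted random walk lets the search cross small fitness plateaus that a strict greedy climber would refuse to enter, which is the "bypass small gains" behavior the abstract describes.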
NAPS: Natural Program Synthesis Dataset
Zavershynskyi, Maksym, Skidanov, Alex, Polosukhin, Illia
We present a program synthesis-oriented dataset consisting of human-written problem statements and solutions for these problems. The problem statements were collected via crowdsourcing and the program solutions were extracted from human-written solutions in programming competitions, accompanied by input/output examples. We propose using this dataset for program synthesis tasks aimed at working with real user-generated data. As a baseline we present a few models, with the best model achieving 8.8% accuracy, showcasing both the complexity of the dataset and the large room for future research.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Europe > Germany > Berlin (0.04)
Using the Crowd to Do Natural Language Programming
Manshadi, Mehdi, Keenan, Carolyn, Allen, James (University of Rochester)
Natural language programming has proven to be a very challenging task. We present a novel idea: using crowdsourcing to do natural language programming. Our approach asks non-expert workers to provide input/output examples for a task defined in natural language form. We then use a Programming by Example system to induce the intended program from the input/output examples. Our early results are promising, encouraging further research in this area.
- Asia > Middle East > Jordan (0.06)
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Comparison of three classification techniques: CART, C4.5 and Multi-Layer Perceptrons
In this paper, after some introductory remarks on the classification problem as considered in various research communities, and some discussion of the reasons for ascertaining the performances of the three chosen algorithms, viz., CART (Classification and Regression Tree), C4.5 (one of the more recent versions of a popular induction tree technique known as ID3), and a multi-layer perceptron (MLP), it is proposed to compare the performances of these algorithms under two criteria: classification and generalisation. It is found that, in general, the MLP has better classification and generalisation accuracies compared with the other two algorithms. 1 Introduction. Classification of data into categories has been pursued by a number of research communities, viz., applied statistics, knowledge acquisition, neural networks. In applied statistics, there are a number of techniques, e.g., clustering algorithms (see e.g., Hartigan) and CART (Classification and Regression Trees, see e.g., Breiman et al.). Clustering algorithms are used when the underlying data naturally fall into a number of groups; the distances among groups are measured by various metrics [Hartigan]. CART [Breiman et al.] has been very popular among applied statisticians. It assumes that the underlying data can be separated into categories; the decision boundaries can either be parallel to the axes or be a linear combination of these axes [1]. Under certain assumptions on the input data and their associated [1] In CART and C4.5, the axes are the same as the input features.
- Oceania > Australia > Queensland (0.05)
- North America > United States > New York (0.05)
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- (4 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
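The abstract's remark that CART-style decision boundaries are parallel to the axes (while a perceptron can learn oblique ones) can be demonstrated with a toy: on points labeled by the oblique rule x + y > 1, the best single axis-parallel split misclassifies some points, while one perceptron unit separates them perfectly. The data and training loop below are minimal illustrations, not the paper's experiments:

```python
# Grid of 2D points labeled by the oblique rule x + y > 1.
data = [((x / 4, y / 4), int(x / 4 + y / 4 > 1.0))
        for x in range(5) for y in range(5)]

def best_stump_accuracy(data):
    """Best single axis-parallel threshold split (a depth-1 CART-style tree)."""
    best = 0.0
    for axis in (0, 1):
        for pt, _ in data:
            t = pt[axis]
            for sense in (True, False):
                acc = sum((p[axis] > t) == (bool(lab) == sense)
                          for p, lab in data) / len(data)
                best = max(best, acc)
    return best

def perceptron_accuracy(data, epochs=300, lr=0.1):
    """Single perceptron unit with the classic mistake-driven update."""
    w0, w1, b = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x, y), lab in data:
            pred = int(w0 * x + w1 * y + b > 0)
            w0 += lr * (lab - pred) * x
            w1 += lr * (lab - pred) * y
            b += lr * (lab - pred)
    return sum(int(w0 * x + w1 * y + b > 0) == lab
               for (x, y), lab in data) / len(data)

print(best_stump_accuracy(data), perceptron_accuracy(data))
```

Because the data are linearly separable with a positive margin, the perceptron convergence theorem guarantees the unit eventually classifies every point correctly, whereas no single axis-parallel threshold can match the diagonal boundary.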